The Correlation of Machine Translation Evaluation Metrics with Human Judgement on Persian Language

Authors

  • Ehsan Pazouki, Assistant Professor of Artificial Intelligence, Faculty of Computer Engineering, Shahid Rajaei Teacher Training University, Tehran, Iran
  • Marziyeh Taleghani, MA in Translation Studies, Faculty of Persian Literature and Foreign Languages, South Tehran Branch of Azad University, Iran
  • Vahid Ghahraman, Assistant Professor of TESOL, Iran Encyclopedia Compiling Foundation, Tehran, Iran
Abstract:

Machine Translation Evaluation Metrics (MTEMs) are the core of Machine Translation (MT) engine development, since engines are refined through frequent evaluation. Although MTEMs are widespread today, their validity and quality for many languages are still in question. The aim of this study was to examine the validity and assess the quality of lexical-similarity MTEMs on machine-translated Persian texts. The study addressed three main questions: to what extent automatic MT evaluation metrics are valid for evaluating translated Persian texts; whether there is a significant correlation between human evaluation and automatic evaluation metrics in evaluating English-to-Persian translations; and which metric is the best predictor of human judgment. For this purpose, a dataset containing 200 English sentences, each with four human reference translations, was translated using four different statistical MT systems. The outputs of these systems were evaluated by seven automatic MTEMs and three human evaluators, and the correlations between the metrics and the human evaluators were calculated using both Spearman and Kendall correlation coefficients. The results confirmed a relatively high correlation between MTEMs and human evaluation on Persian, with GTM proving more efficient than the other metrics.
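The correlation step described in the abstract can be sketched as follows. This is a minimal illustration, not the paper's code: the per-sentence scores below are hypothetical, standing in for one MTEM's scores (e.g. GTM) and one human evaluator's adequacy ratings on the same MT outputs. Spearman's rho and Kendall's tau are computed in pure Python.

```python
def rank(values):
    """Average ranks (1-based); tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # extend j over any run of tied values
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # average of 1-based positions i..j
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman's rho: Pearson correlation of the rank vectors."""
    rx, ry = rank(x), rank(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)

def kendall(x, y):
    """Kendall's tau-a: (concordant - discordant) / number of pairs."""
    n = len(x)
    concordant = discordant = 0
    for i in range(n):
        for j in range(i + 1, n):
            s = (x[i] - x[j]) * (y[i] - y[j])
            if s > 0:
                concordant += 1
            elif s < 0:
                discordant += 1
    return (concordant - discordant) / (n * (n - 1) / 2)

# Hypothetical scores for five MT outputs (illustrative only).
metric_scores = [0.62, 0.41, 0.77, 0.55, 0.30]   # one MTEM, e.g. GTM
human_scores  = [3.5, 2.5, 4.5, 2.0, 1.5]        # one human evaluator

print(round(spearman(metric_scores, human_scores), 3))  # 0.9
print(round(kendall(metric_scores, human_scores), 3))   # 0.8
```

In practice one would compute such correlations for each of the seven metrics against each of the three evaluators and compare the resulting coefficients, as the study does when identifying GTM as the best predictor.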


Similar Sources

The Evaluation of Language-Related Engagement and Task-Related Engagement with the Purpose of Investigating the Effect of Metatalk and Task Typology

Abstract: While task-based instruction is considered the most effective way to learn a language in the related literature, it is oversimplified on various grounds. Different variables may affect how students are engaged not only with the language but also with the task itself. The present study was conducted to investigate language- and task-related engagement on the basis of the task typolog...


Statistical Machine Translation as a Grammar Checker for Persian Language

The existence of automatic writing-assistance tools such as spell and grammar checkers/correctors can help produce higher-quality electronic texts by removing noise and cleaning up sentences. The different kinds of errors in a text can be categorized into spelling, grammatical, and real-word errors. In this article, the concepts of an automatic grammar checker for the Persian (Farsi) language are...


Improved Language Modeling for English-Persian Statistical Machine Translation

As interaction between speakers of different languages continues to increase, the ever-present problem of language barriers must be overcome. For the same reason, automatic language translation (Machine Translation) has become an attractive area of research and development. Statistical Machine Translation (SMT) has been used for translation between many language pairs, the results of which have ...


Evaluation Metrics for Knowledge-Based Machine Translation

A methodology is presented for component-based machine translation (MT) evaluation through causal error analysis to complement existing global evaluation methods. This methodology is particularly appropriate for knowledge-based machine translation (KBMT) systems. After a discussion of MT evaluation criteria and the particular evaluation metrics proposed for KBMT, we apply this methodology to a ...


Meteor, M-BLEU and M-TER: Evaluation Metrics for High-Correlation with Human Rankings of Machine Translation Output

This paper describes our submissions to the machine translation evaluation shared task in ACL WMT-08. Our primary submission is the Meteor metric tuned to optimize correlation with human rankings of translation hypotheses. We show significant improvement in correlation compared to the earlier version of the metric, which was tuned to optimize correlation with traditional adequacy and fluency ...


A Poor Man’s Translation Memory Using Machine Translation Evaluation Metrics

We propose straightforward implementations of translation memory (TM) functionality for research purposes, using machine translation evaluation metrics as similarity functions. Experiments under various conditions demonstrate the effectiveness of the approach, but also highlight problems in evaluating the results using an MT evaluation methodology.


Journal

Volume 9, Issue 3

Pages 43-55

Publication date: 2019-10-01
